Explorized Policy Iteration For Continuous-Time Linear Systems

Authors

Abstract

Related articles

Integral Q-learning and explorized policy iteration for adaptive optimal control of continuous-time linear systems

This paper proposes an integral Q-learning method for continuous-time (CT) linear time-invariant (LTI) systems. It solves a linear quadratic regulation (LQR) problem in real time for a given system and value function, without knowledge of the system dynamics A and B. Here, Q-learning refers to a family of reinforcement learning methods that find the optimal policy by interaction with ...

Adaptive optimal control for continuous-time linear systems based on policy iteration

In this paper we propose a new scheme based on adaptive critics for finding online the infinite-horizon, optimal state-feedback control solution of linear continuous-time systems using only partial knowledge of the system dynamics. In other words, the algorithm solves an algebraic Riccati equation online without knowing the internal dynamics model of the system. Being based on a policy ...

On integral generalized policy iteration for continuous-time linear quadratic regulations

This paper mathematically analyzes the integral generalized policy iteration (I-GPI) algorithms applied to a class of continuous-time linear quadratic regulation (LQR) problems with the unknown system matrix A. GPI is the general idea of interacting policy evaluation and policy improvement steps of policy iteration (PI), for computing the optimal policy. We first introduce the update horizon ...
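The alternation of policy evaluation and policy improvement mentioned in this abstract can be illustrated with a minimal sketch. The code below is the classical model-based policy iteration for CT LQR (Kleinman's method), not the integral or data-driven variants the listed papers propose: it assumes A and B are fully known. The function name, the example matrices, and the initial gain `K0` are hypothetical choices for illustration; `K0` must be stabilizing.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are


def kleinman_policy_iteration(A, B, Q, R, K0, iterations=20):
    """Model-based policy iteration for CT LQR (assumes A and B are known)."""
    K = K0
    for _ in range(iterations):
        Ac = A - B @ K  # closed-loop dynamics under the current policy u = -K x
        # Policy evaluation: solve Ac^T P + P Ac = -(Q + K^T R K) for P.
        P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K))
        # Policy improvement: K = R^{-1} B^T P.
        K = np.linalg.solve(R, B.T @ P)
    return P, K


# Hypothetical example system; A is already stable, so K0 = 0 is stabilizing.
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P, K = kleinman_policy_iteration(A, B, Q, R, K0=np.zeros((1, 2)))

# Compare against SciPy's direct algebraic Riccati equation solver.
P_are = solve_continuous_are(A, B, Q, R)
print(np.allclose(P, P_are, atol=1e-8))
```

The evaluation step here uses the model through `Ac`. The adaptive schemes summarized in this list instead estimate the same quantities from measured trajectories, which this sketch does not attempt.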

Online Adaptive Optimal Control for Continuous-Time Markov Jump Linear Systems Using A Novel Policy Iteration Algorithm

This paper studies the online adaptive optimal control problem for a class of continuous-time Markov Jump Linear Systems (MJLSs) based on a novel policy iteration algorithm. Using a new decoupling technique named Subsystems Transformation, we reconstruct the MJLSs and obtain a set of new coupled systems composed of N subsystems. The online policy iteration algorithm was used to s...

Batch Policy Iteration Algorithms for Continuous Domains

This paper establishes the link between an adaptation of the policy iteration method for Markov decision processes with continuous state and action spaces and the policy gradient method when the differentiation of the mean value is directly done over the policy without parameterization. This approach allows deriving sound and practical batch Reinforcement Learning algorithms for continuous stat...

Journal

Journal title: The Transactions of The Korean Institute of Electrical Engineers

Year: 2012

ISSN: 1975-8359

DOI: 10.5370/kiee.2012.61.3.451